 email network


Machine learning: Disrupting the cyber security industry

#artificialintelligence

Despite the emergence of apps like Slack and Yammer for internal employee communication, email is still the dominant form of external employee communication for enterprises. "In a similar way that computers, servers and devices communicate with one another through data packets transmitted via TCP/IP, employees communicate with one another through natural language and documents shared via email," says Bishop. Why are account takeovers on the rise? And how can businesses prevent this method of attack? Asaf Cidon, from Barracuda Networks, helps Information Age answer these questions. "When email was created in the early 1970s it was the first 'killer app' for the web.


Tensorial and bipartite block models for link prediction in layered networks and temporal networks

Tarres-Deulofeu, Marc, Godoy-Lorite, Antonia, Guimera, Roger, Sales-Pardo, Marta

arXiv.org Machine Learning

Imagine a team of researchers looking for promising drug combinations to treat a specific cancer type for which current treatments are ineffective. The team has data on the effect of certain pairs of drugs on other cancer types, but the data are very sparse--only a few drug pairs have been tested on each cancer type, and each drug pair is tested in a few cancer types, at best, or has never been tested at all. The challenge is to select the most promising drug pairs for testing with the target cancer type, so as to minimize the cost associated with unsuccessful tests. We can formalize this challenge as the following inference problem: We have a partial observation of the pairwise interactions between a set of nodes (drugs) in different "network layers" (cancer types), and we need to infer which are the unobserved interactions within each layer (drug interactions in each cancer type). This challenge is relevant for the many systems that can be represented as multilayer networks [1-4], and is also formally analogous to the challenge of predicting the existence of interactions between nodes in time-resolved networks [5-11]. For instance, we would face the same situation if we had data about the daily email or phone communications between users, and wanted to infer the existence of interactions between pairs of users on a certain unobserved day; in this case each layer would be a different day. Here, we introduce new generative models that are suitable to address the challenge above. We model all layers concurrently, so that our approach takes full advantage of the information contained in all layers to make predictions for any one of them.


Higher-order clustering in networks

Yin, Hao, Benson, Austin R., Leskovec, Jure

arXiv.org Machine Learning

A fundamental property of complex networks is the tendency for edges to cluster. The extent of the clustering is typically quantified by the clustering coefficient, which is the probability that a length-2 path is closed, i.e., induces a triangle in the network. However, higher-order cliques beyond triangles are crucial to understanding complex networks, and the clustering behavior with respect to such higher-order network structures is not well understood. Here we introduce higher-order clustering coefficients that measure the closure probability of higher-order network cliques and provide a more comprehensive view of how the edges of complex networks cluster. Our higher-order clustering coefficients are a natural generalization of the traditional clustering coefficient. We derive several properties about higher-order clustering coefficients and analyze them under common random graph models. Finally, we use higher-order clustering coefficients to gain new insights into the structure of real-world networks from several domains.
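
As a rough illustration of the traditional (triangle-based) clustering coefficient described above, the following Python sketch counts length-2 paths and the fraction of them that are closed. It covers only the classical quantity, not the higher-order coefficients the abstract introduces, and the function name and edge-list input format are assumptions made for this example.

```python
from itertools import combinations


def global_clustering_coefficient(edges):
    """Fraction of length-2 paths (wedges) that are closed into triangles.

    `edges` is an iterable of (u, v) pairs for an undirected simple graph;
    this input format is an assumption of the sketch.
    """
    # Build an undirected adjacency map, ignoring self-loops.
    neighbors = {}
    for u, v in edges:
        if u == v:
            continue
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    wedges = 0  # length-2 paths a - center - b
    closed = 0  # wedges whose endpoints a and b are also adjacent
    for center, nbrs in neighbors.items():
        for a, b in combinations(nbrs, 2):
            wedges += 1
            if b in neighbors[a]:
                closed += 1

    # A graph with no wedges has no defined closure probability; return 0.0.
    return closed / wedges if wedges else 0.0


# A triangle (1, 2, 3) with one pendant edge (3, 4).
example_edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
coefficient = global_clustering_coefficient(example_edges)
```

In the example graph there are five length-2 paths, three of which lie in the triangle, so the coefficient is 3/5. Each triangle closes three wedges (one per center node), which is why this count agrees with the usual "three times triangles over connected triples" form.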


Topic-Partitioned Multinetwork Embeddings

Krafft, Peter, Moore, Juston, Desmarais, Bruce, Wallach, Hanna M.

Neural Information Processing Systems

We introduce a new Bayesian admixture model intended for exploratory analysis of communication networks--specifically, the discovery and visualization of topic-specific subnetworks in email data sets. Our model produces principled visualizations of email networks, i.e., visualizations that have precise mathematical interpretations in terms of our model and its relationship to the observed data. We validate our modeling assumptions by demonstrating that our model achieves better link prediction performance than three state-of-the-art network models and exhibits topic coherence comparable to that of latent Dirichlet allocation. We showcase our model's ability to discover and visualize topic-specific communication patterns using a new email data set: the New Hanover County email network. We provide an extensive analysis of these communication patterns, leading us to recommend our model for any exploratory analysis of email networks or other similarly-structured communication data. Finally, we advocate for principled visualization as a primary objective in the development of new network models.